Answers to questions about biostatistics

(that you’ve never even asked)

Philip Boonstra

2025-05-12

Things to talk about

  • How I got to what I do today

  • What I do on a daily basis

  • Hands-on R project

  • Closing thoughts

High School (1998-2002) / College (2002-2006)

  • Took AP Calculus, but AP Statistics was not offered at my high school

  • Double major in Mathematics / Political Science in college

  • Took two courses in Mathematical Statistics my senior year

  • Summer research project in Biostatistics after graduation

What is Biostatistics?

  • Application of statistical science to questions related to biology and public health, e.g. 

    • What is the impact of smoking during pregnancy on the growth trajectory of an infant?
    • Does this drug help cancer patients live longer?
    • Which genes are over-expressed in people who have Type II diabetes?
    • How does excess mortality due to COVID-19 vary across countries?

Graduate school (2007-2012)

  • Studied Biostatistics at the University of Michigan (UM)

  • Lots of coursework in the MS program (~12 courses in biostatistics or statistics and ~8 courses in epidemiology, genetics, and public health)

  • Additional year of courses for PhD, then research and writing a dissertation

After graduate school (>2012)

  • Professor at UM

  • My days / evenings consist of a mix of:

    • teaching classes

    • working and researching with graduate students

    • doing research

    • writing R code

Some of the projects I work on: ECMO research

J Am Med Assoc. 2019;322(6):557-568.

ECMO for COVID-19

  • Patients with confirmed SARS-CoV-2 infection, aged 16 years or older, who started ECMO between Jan 16, 2020 and May 1, 2020

  • 1035 patients from 213 hospitals across 36 countries

  • Estimated 37% in-hospital mortality 90 days after starting ECMO

The Lancet. 2020;396:1071-1078

Cancer

Cancer drugs often harm the patient as well as the disease

When new drugs are developed or new combinations are proposed, the right amount to give the patient is not known: too little may not be efficacious against the cancer and too much may be toxic for the patient.

First-in-human trials gradually increase the dose level of the drug.

J Natl Cancer Inst. 2009 May 20; 101(10): 708–720.
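
To make the idea of gradually increasing the dose concrete, here is a minimal sketch of one simplified rule-based scheme, often called the "3+3" rule. It is written in Python purely for illustration; the function name, the toxicity probabilities, and the exact stopping rule are assumptions made for this example, not a description of any particular trial or of the methods used in the work cited above.

```python
import random


def simulate_three_plus_three(tox_probs, seed=0):
    """Simulate one simplified 3+3 dose-escalation trial.

    tox_probs: hypothetical probability of a serious toxicity at each
               dose level, ordered from lowest to highest dose.
    Returns the index of the recommended dose level, or None if even
    the lowest dose looks too toxic.
    """
    rng = random.Random(seed)

    def toxicities(level, n=3):
        # Count how many of n simulated patients have a toxicity.
        return sum(rng.random() < tox_probs[level] for _ in range(n))

    for level in range(len(tox_probs)):
        n_tox = toxicities(level)      # first cohort of three patients
        if n_tox == 1:
            n_tox += toxicities(level)  # one toxicity: treat three more
        if n_tox >= 2:
            # Two or more toxicities: stop and step back one level.
            return level - 1 if level > 0 else None
        # Otherwise (0 of 3, or 1 of 6): escalate to the next level.
    return len(tox_probs) - 1           # highest level was tolerated


# Made-up toxicity probabilities for four dose levels.
print(simulate_three_plus_three([0.05, 0.10, 0.25, 0.50], seed=1))
```

Changing the seed gives a different simulated trial, which shows how much the recommended dose can vary from one small trial to the next.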

Technical research

I sometimes get to write theorems:

Biostatistics (2013); 14(2): 259–272

Generative AI

ChatGPT lies. Boldly.

“While ChatGPT can write credible scientific essays, the data it generates is a mix of true and completely fabricated ones.”

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9939079/

ChatGPT loves fluffy words

  • “Our results indicate that the largest and fastest growth was observed in Computer Science papers, with [the proportion of LLM-modified sentences] reaching 17.5% for abstracts and 15.3% for introductions by February 2024”

https://arxiv.org/pdf/2404.01268

Misconceptions about statisticians I encounter

  • Statisticians are only p-value machines and sample size calculators

  • Statisticians can’t help with study design or data collection

  • Statisticians don’t care about the science

  • More data is always better

(Bio)statisticians are always in demand

R project

  1. Respond to the Google Form: https://tinyurl.com/salineapstats

  2. Go to posit.cloud

  3. Create a free account

  4. Open https://posit.cloud/content/4056824

Things you can do if you want to be a biostatistician

  • Go to college:

    • If you want to stop with a bachelor’s degree, consider majoring in statistics or data science

    • If you are thinking about graduate school, consider majoring in mathematics

    • Hone written and oral communication skills

  • Summer programs in Biostatistics

    • https://sph.umich.edu/bdsi/

  • Lots of free resources for learning to code:

    • https://www.coursera.org/specializations/data-science-for-health-research

    • https://www.coursera.org/learn/r-programming

    • https://swcarpentry.github.io/r-novice-gapminder/